Scalability Analysis of Declustering Methods for Multidimensional Range Queries

Authors

  • Bongki Moon
  • Joel H. Saltz
Abstract

Efficient storage and retrieval of multiattribute data sets has become one of the essential requirements for many data-intensive applications. The Cartesian product file has been known as an effective multiattribute file structure for partial-match and best-match queries. Several heuristic methods have been developed to decluster Cartesian product files across multiple disks to obtain high performance for disk accesses. Although the scalability of the declustering methods becomes increasingly important for systems equipped with a large number of disks, no analytic studies have been done so far. In this paper, we derive formulas describing the scalability of two popular declustering methods, Disk Modulo and Fieldwise Xor, for range queries, which are the most common type of queries. These formulas disclose the limited scalability of the declustering methods, and this is corroborated by extensive simulation experiments. From the practical point of view, the formulas given in this paper provide a simple measure that can be used to predict the response time of a given range query and to guide the selection of a declustering method under various conditions.
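The abstract names the two methods without defining them. As a rough illustration, the sketch below uses the commonly cited definitions: Disk Modulo assigns a bucket to the disk given by the sum of its coordinates modulo the number of disks, and Fieldwise Xor uses the bitwise XOR of the coordinates modulo the number of disks. The function names, the inclusive bucket-range representation of a query, and the cost model (response time taken as the largest number of buckets any one disk must read) are assumptions made for this example. It is a brute-force count, not the closed-form formulas derived in the paper.

```python
from functools import reduce
from itertools import product
from math import ceil, prod
from operator import xor


def disk_modulo(bucket, num_disks):
    """Disk Modulo: sum of the bucket coordinates, modulo the disk count."""
    return sum(bucket) % num_disks


def fieldwise_xor(bucket, num_disks):
    """Fieldwise Xor: bitwise XOR of the coordinates, modulo the disk count."""
    return reduce(xor, bucket) % num_disks


def response_time(method, lower, upper, num_disks):
    """Largest number of buckets a single disk serves for a range query.

    The query covers bucket coordinates lower[k]..upper[k] (inclusive)
    in every dimension k. This cost model is an assumption of the sketch.
    """
    load = [0] * num_disks
    ranges = (range(lo, hi + 1) for lo, hi in zip(lower, upper))
    for bucket in product(*ranges):
        load[method(bucket, num_disks)] += 1
    return max(load)


def optimal_response_time(lower, upper, num_disks):
    """Lower bound: the query's buckets spread perfectly evenly over the disks."""
    size = prod(hi - lo + 1 for lo, hi in zip(lower, upper))
    return ceil(size / num_disks)


if __name__ == "__main__":
    # A 2 x 2 range query on a two-dimensional grid, with a growing disk count.
    lower, upper = (0, 0), (1, 1)
    for num_disks in (4, 16):
        print(
            num_disks,
            response_time(disk_modulo, lower, upper, num_disks),
            response_time(fieldwise_xor, lower, upper, num_disks),
            optimal_response_time(lower, upper, num_disks),
        )
```

For this small query, both methods place two of the four buckets on the same disk whether 4 or 16 disks are available, while the even-spread bound is one bucket per disk. That is a toy instance of the limited scalability the abstract refers to.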

Similar resources

Scalability Analysis of Declustering Methods for Cartesian Product Files

Efficient storage and retrieval of multi-attribute datasets has become one of the essential requirements for many data-intensive applications. The Cartesian product file has been known as an effective multi-attribute file structure for partial-match and best-match queries. Several heuristic methods have been developed to decluster Cartesian product files over multiple disks to obtain high perfo...

Latin Hypercubes: A Class of Multidimensional Declustering Techniques

The I/O subsystem is widely accepted as one of the principal bottlenecks for high performance parallel database systems. The emergence of parallel I/O architectures has made the problem of data declustering, i.e. fragmenting a file of records and allocating the pieces to different disks, one of prime importance. This is evident from the growing activity in this area. In this study we focus only ...

Study of Scalable Declustering Algorithms for Parallel Grid Files

Efficient storage and retrieval of large multidimensional datasets is an important concern for large-scale scientific computations such as long-running time-dependent simulations which periodically generate snapshots of the state. The main challenge for efficiently handling such datasets is to minimize response time for multidimensional range queries. The grid file is one of the well known acce...

Declustering Using Golden Ratio Sequences

In this paper we propose a new data declustering scheme for range queries. Our scheme is based on Golden Ratio Sequences (GRS), which have found applications in broadcast disks, hashing, packet routing, etc. We show by analysis and simulation that GRS is nearly the best possible scheme for 2-dimensional range queries. Specifically, it is the best possible scheme when the number of disks (M) is a...

Declustering Using Fractals

We propose a method to achieve declustering for Cartesian product files on M units. The focus is on range queries, as opposed to partial match queries that older declustering methods have examined. Our method uses a distance-preserving mapping, namely, the Hilbert curve, to impose a linear ordering on the multidimensional points (buckets); then, it traverses the buckets according to this ordering...

Journal title:
  • IEEE Trans. Knowl. Data Eng.

Volume 10

Publication date: 1998